Overview

Dataset statistics

Number of variables: 33
Number of observations: 800
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 1.3 KiB

Variable types

Numeric: 12
Categorical: 20
Boolean: 1

Alerts

months_as_customer is highly overall correlated with age [High correlation]
age is highly overall correlated with months_as_customer [High correlation]
total_claim_amount is highly overall correlated with injury_claim and 4 other fields [High correlation]
injury_claim is highly overall correlated with total_claim_amount and 4 other fields [High correlation]
property_claim is highly overall correlated with total_claim_amount and 4 other fields [High correlation]
vehicle_claim is highly overall correlated with total_claim_amount and 4 other fields [High correlation]
number_of_vehicles_involved is highly overall correlated with incident_type [High correlation]
incident_type is highly overall correlated with total_claim_amount and 5 other fields [High correlation]
collision_type is highly overall correlated with total_claim_amount and 4 other fields [High correlation]
auto_make is highly overall correlated with auto_model [High correlation]
auto_model is highly overall correlated with auto_make [High correlation]
umbrella_limit is highly imbalanced (63.9%) [Imbalance]
policy_number has unique values [Unique]
capital-gains has 414 (51.7%) zeros [Zeros]
capital-loss has 383 (47.9%) zeros [Zeros]
incident_hour_of_the_day has 43 (5.4%) zeros [Zeros]
injury_claim has 18 (2.2%) zeros [Zeros]
property_claim has 16 (2.0%) zeros [Zeros]
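
The imbalance figure for umbrella_limit can be approximated from its value counts. The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution. That formula is an assumption that happens to reproduce the reported 63.9%; it is not a confirmed description of the profiler's internals. The counts come from the Common Values table for umbrella_limit. The report lists only ten of the eleven distinct values, so the eleventh count of 1 is inferred from the 800-row total.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts from the umbrella_limit Common Values table. The final 1 is inferred:
# Distinct is 11, but only ten values are listed and they sum to 799.
umbrella_counts = [639, 46, 34, 33, 23, 11, 6, 3, 3, 1, 1]

score = imbalance_score(umbrella_counts)
print(f"{score:.1%}")
```

A score near 0 would indicate evenly spread classes, and a score near 1 a single dominant class.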

Reproduction

Analysis started: 2023-08-25 00:12:59.388870
Analysis finished: 2023-08-25 00:13:22.200302
Duration: 22.81 seconds
Software version: ydata-profiling v4.3.1
Download configuration: config.json
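
A report of this kind could be regenerated roughly as follows. This is a minimal sketch that assumes the data is available as a CSV file; the function name `build_report`, the file names, and the report title are hypothetical, and the settings used for the original run live in config.json rather than in this snippet.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imports are local so this module loads even without the dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")  # title is a placeholder
    profile.to_file(html_path)
    return profile

# Hypothetical usage; "claims.csv" is a placeholder file name.
# build_report("claims.csv")
```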

Variables

months_as_customer
Real number (ℝ)

HIGH CORRELATION 

Distinct: 360
Distinct (%): 45.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 204.00125
Minimum: 0
Maximum: 479
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 27
Q1: 115.75
Median: 202
Q3: 276
95-th percentile: 428
Maximum: 479
Range: 479
Interquartile range (IQR): 160.25

Descriptive statistics

Standard deviation: 114.30895
Coefficient of variation (CV): 0.56033458
Kurtosis: -0.47826313
Mean: 204.00125
Median Absolute Deviation (MAD): 80.5
Skewness: 0.34434294
Sum: 163201
Variance: 13066.537
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
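
Several of the statistics above are derived from others, which gives a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation, and variance for months_as_customer from the reported summary values. It assumes the variance is the square of the reported standard deviation, and small differences are expected because the report rounds its figures.

```python
# Summary values copied from the months_as_customer section of the report.
minimum, maximum = 0, 479
q1, q3 = 115.75, 276
mean, std = 204.00125, 114.30895

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance, assuming std is its square root

print(value_range, iqr, round(cv, 6), round(variance, 2))
```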
ValueCountFrequency (%)
254 7
 
0.9%
210 6
 
0.8%
239 6
 
0.8%
257 6
 
0.8%
222 6
 
0.8%
61 6
 
0.8%
140 6
 
0.8%
246 6
 
0.8%
194 6
 
0.8%
289 6
 
0.8%
Other values (350) 739
92.4%
ValueCountFrequency (%)
0 1
 
0.1%
1 3
0.4%
2 2
0.2%
3 2
0.2%
4 2
0.2%
5 1
 
0.1%
6 1
 
0.1%
7 1
 
0.1%
8 3
0.4%
9 2
0.2%
ValueCountFrequency (%)
479 1
0.1%
478 2
0.2%
476 1
0.1%
475 2
0.2%
472 1
0.1%
468 1
0.1%
465 1
0.1%
463 1
0.1%
461 2
0.2%
460 1
0.1%

age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 45
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.01625
Minimum: 20
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 20
5-th percentile: 26
Q1: 32
Median: 38.5
Q3: 44.25
95-th percentile: 57
Maximum: 64
Range: 44
Interquartile range (IQR): 12.25

Descriptive statistics

Standard deviation: 9.1112433
Coefficient of variation (CV): 0.23352432
Kurtosis: -0.25410327
Mean: 39.01625
Median Absolute Deviation (MAD): 6.5
Skewness: 0.45435178
Sum: 31213
Variance: 83.014754
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
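
A histogram with fixed-size bins divides the span from minimum to maximum into equal-width intervals and counts the values in each. The sketch below shows that idea with the standard library only. The function name and the sample ages are hypothetical, and the profiler's own binning code may differ in details such as edge handling.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the last edge, so clamp it into the final bin.
        idx = min(int((v - lo) / width), bins - 1) if width > 0 else 0
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

# Hypothetical sample of ages, not the report's data.
ages = [20, 21, 21, 30, 38, 38, 39, 41, 41, 41, 57, 64]
counts, edges = fixed_width_histogram(ages, bins=4)
print(counts, edges)
```

With 45 bins over the reported age range of 20 to 64, each bin would be 44/45 of a year wide, so nearly every integer age gets its own bar.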
ValueCountFrequency (%)
41 40
 
5.0%
39 39
 
4.9%
43 38
 
4.8%
38 37
 
4.6%
31 33
 
4.1%
34 32
 
4.0%
30 32
 
4.0%
33 31
 
3.9%
40 30
 
3.8%
37 30
 
3.8%
Other values (35) 458
57.2%
ValueCountFrequency (%)
20 1
 
0.1%
21 6
 
0.8%
22 1
 
0.1%
23 5
 
0.6%
24 9
 
1.1%
25 11
 
1.4%
26 19
2.4%
27 20
2.5%
28 25
3.1%
29 28
3.5%
ValueCountFrequency (%)
64 2
 
0.2%
63 2
 
0.2%
62 3
 
0.4%
61 9
1.1%
60 6
0.8%
59 2
 
0.2%
58 4
 
0.5%
57 14
1.8%
56 7
0.9%
55 13
1.6%

policy_number
Real number (ℝ)

UNIQUE 

Distinct: 800
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 544664.69
Minimum: 100804
Maximum: 999435
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 100804
5-th percentile: 143600.15
Q1: 330473
Median: 533940.5
Q3: 757918
95-th percentile: 952351.7
Maximum: 999435
Range: 898631
Interquartile range (IQR): 427445

Descriptive statistics

Standard deviation: 257844.8
Coefficient of variation (CV): 0.47340098
Kurtosis: -1.1484346
Mean: 544664.69
Median Absolute Deviation (MAD): 218731.5
Skewness: 0.028276243
Sum: 4.3573176 × 10^8
Variance: 6.648394 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
669501 1
 
0.1%
599174 1
 
0.1%
913337 1
 
0.1%
118236 1
 
0.1%
283414 1
 
0.1%
307447 1
 
0.1%
542245 1
 
0.1%
458829 1
 
0.1%
587498 1
 
0.1%
844007 1
 
0.1%
Other values (790) 790
98.8%
ValueCountFrequency (%)
100804 1
0.1%
104594 1
0.1%
106186 1
0.1%
107181 1
0.1%
108270 1
0.1%
108844 1
0.1%
109392 1
0.1%
110084 1
0.1%
110122 1
0.1%
110143 1
0.1%
ValueCountFrequency (%)
999435 1
0.1%
998192 1
0.1%
996850 1
0.1%
996253 1
0.1%
994538 1
0.1%
993840 1
0.1%
992145 1
0.1%
990998 1
0.1%
990493 1
0.1%
987524 1
0.1%

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB
500 277
1000 268
2000 255

Length

Max length: 4
Median length: 4
Mean length: 3.65375
Min length: 3

Characters and Unicode

Total characters: 2923
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 500
2nd row: 2000
3rd row: 500
4th row: 500
5th row: 1000

Common Values

ValueCountFrequency (%)
500 277
34.6%
1000 268
33.5%
2000 255
31.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
500 277
34.6%
1000 268
33.5%
2000 255
31.9%

Most occurring characters

ValueCountFrequency (%)
0 2123
72.6%
5 277
 
9.5%
1 268
 
9.2%
2 255
 
8.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2923
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2123
72.6%
5 277
 
9.5%
1 268
 
9.2%
2 255
 
8.7%

Most occurring scripts

ValueCountFrequency (%)
Common 2923
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2123
72.6%
5 277
 
9.5%
1 268
 
9.2%
2 255
 
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2923
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2123
72.6%
5 277
 
9.5%
1 268
 
9.2%
2 255
 
8.7%

policy_annual_premium
Real number (ℝ)

Distinct: 528
Distinct (%): 66.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1257.7175
Minimum: 433
Maximum: 2047
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 433
5-th percentile: 862
Q1: 1097
Median: 1265
Q3: 1406
95-th percentile: 1655.45
Maximum: 2047
Range: 1614
Interquartile range (IQR): 309

Descriptive statistics

Standard deviation: 241.70324
Coefficient of variation (CV): 0.1921761
Kurtosis: 0.18112122
Mean: 1257.7175
Median Absolute Deviation (MAD): 155.5
Skewness: 0.019515483
Sum: 1006174
Variance: 58420.456
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1124 6
 
0.8%
1281 6
 
0.8%
1270 4
 
0.5%
1255 4
 
0.5%
1338 4
 
0.5%
1238 4
 
0.5%
1074 4
 
0.5%
1141 4
 
0.5%
1257 4
 
0.5%
1362 4
 
0.5%
Other values (518) 756
94.5%
ValueCountFrequency (%)
433 1
0.1%
484 1
0.1%
538 1
0.1%
617 1
0.1%
625 1
0.1%
664 1
0.1%
671 1
0.1%
708 1
0.1%
719 1
0.1%
722 1
0.1%
ValueCountFrequency (%)
2047 1
0.1%
1969 1
0.1%
1935 1
0.1%
1927 1
0.1%
1922 1
0.1%
1878 1
0.1%
1865 1
0.1%
1863 1
0.1%
1861 1
0.1%
1848 1
0.1%

umbrella_limit
Categorical

IMBALANCE 

Distinct: 11
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 52.5 KiB
0 639
6000000 46
5000000 34
4000000 33
7000000 23
Other values (6) 25

Length

Max length: 8
Median length: 1
Mean length: 2.21
Min length: 1

Characters and Unicode

Total characters: 1768
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: 4000000
2nd row: 0
3rd row: 4000000
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 639
79.9%
6000000 46
 
5.8%
5000000 34
 
4.2%
4000000 33
 
4.1%
7000000 23
 
2.9%
3000000 11
 
1.4%
8000000 6
 
0.8%
9000000 3
 
0.4%
2000000 3
 
0.4%
-1000000 1
 
0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0 639
79.9%
6000000 46
 
5.8%
5000000 34
 
4.2%
4000000 33
 
4.1%
7000000 23
 
2.9%
3000000 11
 
1.4%
8000000 6
 
0.8%
9000000 3
 
0.4%
2000000 3
 
0.4%
1000000 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1606
90.8%
6 46
 
2.6%
5 34
 
1.9%
4 33
 
1.9%
7 23
 
1.3%
3 11
 
0.6%
8 6
 
0.3%
9 3
 
0.2%
2 3
 
0.2%
1 2
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1767
99.9%
Dash Punctuation 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1606
90.9%
6 46
 
2.6%
5 34
 
1.9%
4 33
 
1.9%
7 23
 
1.3%
3 11
 
0.6%
8 6
 
0.3%
9 3
 
0.2%
2 3
 
0.2%
1 2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
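
The category labels in these tables are Unicode general categories: Decimal Number corresponds to the code `Nd` and Dash Punctuation to `Pd`. As a small illustration, the sketch below classifies the characters of the single negative umbrella_limit value with Python's `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiler computes its tallies.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (such as 'Nd' or 'Pd') of the characters in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# "-1000000" is the one value that contributes a Dash Punctuation character.
counts = category_counts("-1000000")
print(counts)
```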

Most occurring scripts

ValueCountFrequency (%)
Common 1768
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1606
90.8%
6 46
 
2.6%
5 34
 
1.9%
4 33
 
1.9%
7 23
 
1.3%
3 11
 
0.6%
8 6
 
0.3%
9 3
 
0.2%
2 3
 
0.2%
1 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1768
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1606
90.8%
6 46
 
2.6%
5 34
 
1.9%
4 33
 
1.9%
7 23
 
1.3%
3 11
 
0.6%
8 6
 
0.3%
9 3
 
0.2%
2 3
 
0.2%
1 2
 
0.1%

insured_zip
Real number (ℝ)

Distinct: 796
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 502376.07
Minimum: 430104
Maximum: 620962
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 430104
5-th percentile: 433143.95
Q1: 448692.25
Median: 466704.5
Q3: 603417.5
95-th percentile: 617721.9
Maximum: 620962
Range: 190858
Interquartile range (IQR): 154725.25

Descriptive statistics

Standard deviation: 72297.681
Coefficient of variation (CV): 0.14391147
Kurtosis: -1.2617147
Mean: 502376.07
Median Absolute Deviation (MAD): 23117
Skewness: 0.77465227
Sum: 4.0190086 × 10^8
Variance: 5.2269546 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
469429 2
 
0.2%
477695 2
 
0.2%
431202 2
 
0.2%
456602 2
 
0.2%
449421 1
 
0.1%
613945 1
 
0.1%
432896 1
 
0.1%
603248 1
 
0.1%
462377 1
 
0.1%
602670 1
 
0.1%
Other values (786) 786
98.2%
ValueCountFrequency (%)
430104 1
0.1%
430141 1
0.1%
430232 1
0.1%
430380 1
0.1%
430567 1
0.1%
430621 1
0.1%
430632 1
0.1%
430665 1
0.1%
430714 1
0.1%
430832 1
0.1%
ValueCountFrequency (%)
620962 1
0.1%
620869 1
0.1%
620819 1
0.1%
620757 1
0.1%
620737 1
0.1%
620507 1
0.1%
620493 1
0.1%
620473 1
0.1%
620207 1
0.1%
620197 1
0.1%

capital-gains
Real number (ℝ)

ZEROS 

Distinct: 285
Distinct (%): 35.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24541.625
Minimum: 0
Maximum: 100500
Zeros: 414
Zeros (%): 51.7%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 50400
95-th percentile: 69520
Maximum: 100500
Range: 100500
Interquartile range (IQR): 50400

Descriptive statistics

Standard deviation: 27683
Coefficient of variation (CV): 1.1280019
Kurtosis: -1.2473234
Mean: 24541.625
Median Absolute Deviation (MAD): 0
Skewness: 0.5069596
Sum: 19633300
Variance: 7.6634852 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
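
The MAD of 0 for capital-gains follows from the zero count: with 414 of 800 values equal to zero, the median is 0, and more than half of the absolute deviations from it are also 0. The sketch below illustrates this with the standard library. The 386 non-zero values are hypothetical stand-ins, since only their count matters for the result.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)

# 414 zeros as in the report, plus 386 made-up positive values.
gains = [0] * 414 + [10000 + 100 * i for i in range(386)]

median_gain = statistics.median(gains)
mad = median_absolute_deviation(gains)
print(median_gain, mad)
```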
ValueCountFrequency (%)
0 414
51.7%
68500 4
 
0.5%
51500 4
 
0.5%
63100 3
 
0.4%
51100 3
 
0.4%
45700 3
 
0.4%
58500 3
 
0.4%
46300 3
 
0.4%
45500 3
 
0.4%
59600 3
 
0.4%
Other values (275) 357
44.6%
ValueCountFrequency (%)
0 414
51.7%
800 1
 
0.1%
10000 1
 
0.1%
11000 1
 
0.1%
12100 1
 
0.1%
13100 1
 
0.1%
14100 1
 
0.1%
16100 1
 
0.1%
17600 1
 
0.1%
20200 1
 
0.1%
ValueCountFrequency (%)
100500 1
0.1%
98800 1
0.1%
94800 1
0.1%
91900 1
0.1%
90700 1
0.1%
88400 1
0.1%
87800 1
0.1%
84900 1
0.1%
83900 1
0.1%
82600 1
0.1%

capital-loss
Real number (ℝ)

ZEROS 

Distinct: 305
Distinct (%): 38.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -26499.875
Minimum: -111100
Maximum: 0
Zeros: 383
Zeros (%): 47.9%
Negative: 417
Negative (%): 52.1%
Memory size: 9.4 KiB

Quantile statistics

Minimum: -111100
5-th percentile: -72005
Q1: -51025
Median: -21200
Q3: 0
95-th percentile: 0
Maximum: 0
Range: 111100
Interquartile range (IQR): 51025

Descriptive statistics

Standard deviation: 27953.003
Coefficient of variation (CV): -1.0548353
Kurtosis: -1.2917047
Mean: -26499.875
Median Absolute Deviation (MAD): 21200
Skewness: -0.40163292
Sum: -21199900
Variance: 7.8137038 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 383
47.9%
-51000 4
 
0.5%
-32800 4
 
0.5%
-53800 4
 
0.5%
-50300 4
 
0.5%
-53700 4
 
0.5%
-64500 3
 
0.4%
-45800 3
 
0.4%
-45300 3
 
0.4%
-61000 3
 
0.4%
Other values (295) 385
48.1%
ValueCountFrequency (%)
-111100 1
0.1%
-91400 1
0.1%
-91200 1
0.1%
-90600 1
0.1%
-90200 1
0.1%
-90100 1
0.1%
-89400 1
0.1%
-88300 1
0.1%
-87300 1
0.1%
-83200 1
0.1%
ValueCountFrequency (%)
0 383
47.9%
-5700 1
 
0.1%
-6300 1
 
0.1%
-8500 1
 
0.1%
-10600 1
 
0.1%
-13200 1
 
0.1%
-13800 1
 
0.1%
-15600 1
 
0.1%
-15700 2
 
0.2%
-15900 1
 
0.1%

incident_hour_of_the_day
Real number (ℝ)

ZEROS 

Distinct: 24
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.56875
Minimum: 0
Maximum: 23
Zeros: 43
Zeros (%): 5.4%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
Median: 12
Q3: 17
95-th percentile: 23
Maximum: 23
Range: 23
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.9039738
Coefficient of variation (CV): 0.59677786
Kurtosis: -1.1734177
Mean: 11.56875
Median Absolute Deviation (MAD): 6
Skewness: -0.016720534
Sum: 9255
Variance: 47.664855
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
0 43
 
5.4%
3 43
 
5.4%
23 42
 
5.2%
10 41
 
5.1%
16 40
 
5.0%
9 39
 
4.9%
17 39
 
4.9%
6 38
 
4.8%
21 36
 
4.5%
13 36
 
4.5%
Other values (14) 403
50.4%
ValueCountFrequency (%)
0 43
5.4%
1 22
2.8%
2 23
2.9%
3 43
5.4%
4 35
4.4%
5 24
3.0%
6 38
4.8%
7 35
4.4%
8 29
3.6%
9 39
4.9%
ValueCountFrequency (%)
23 42
5.2%
22 23
2.9%
21 36
4.5%
20 26
3.2%
19 34
4.2%
18 35
4.4%
17 39
4.9%
16 40
5.0%
15 29
3.6%
14 34
4.2%

number_of_vehicles_involved
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 51.6 KiB
1 466
3 286
4 25
2 23

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 800
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 3
3rd row: 1
4th row: 4
5th row: 1

Common Values

ValueCountFrequency (%)
1 466
58.2%
3 286
35.8%
4 25
 
3.1%
2 23
 
2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 466
58.2%
3 286
35.8%
4 25
 
3.1%
2 23
 
2.9%

Most occurring characters

ValueCountFrequency (%)
1 466
58.2%
3 286
35.8%
4 25
 
3.1%
2 23
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 800
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 466
58.2%
3 286
35.8%
4 25
 
3.1%
2 23
 
2.9%

Most occurring scripts

ValueCountFrequency (%)
Common 800
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 466
58.2%
3 286
35.8%
4 25
 
3.1%
2 23
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 466
58.2%
3 286
35.8%
4 25
 
3.1%
2 23
 
2.9%

bodily_injuries
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 51.6 KiB
0 276
2 272
1 252

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 800
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 2
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 276
34.5%
2 272
34.0%
1 252
31.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 276
34.5%
2 272
34.0%
1 252
31.5%

Most occurring characters

ValueCountFrequency (%)
0 276
34.5%
2 272
34.0%
1 252
31.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 800
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 276
34.5%
2 272
34.0%
1 252
31.5%

Most occurring scripts

ValueCountFrequency (%)
Common 800
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 276
34.5%
2 272
34.0%
1 252
31.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 276
34.5%
2 272
34.0%
1 252
31.5%

witnesses
Categorical

Distinct: 4
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 51.6 KiB
1 208
2 204
0 196
3 192

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 800
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 3
4th row: 0
5th row: 1

Common Values

ValueCountFrequency (%)
1 208
26.0%
2 204
25.5%
0 196
24.5%
3 192
24.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 208
26.0%
2 204
25.5%
0 196
24.5%
3 192
24.0%

Most occurring characters

ValueCountFrequency (%)
1 208
26.0%
2 204
25.5%
0 196
24.5%
3 192
24.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 800
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 208
26.0%
2 204
25.5%
0 196
24.5%
3 192
24.0%

Most occurring scripts

ValueCountFrequency (%)
Common 800
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 208
26.0%
2 204
25.5%
0 196
24.5%
3 192
24.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 208
26.0%
2 204
25.5%
0 196
24.5%
3 192
24.0%

total_claim_amount
Real number (ℝ)

HIGH CORRELATION 

Distinct: 639
Distinct (%): 79.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52173.8
Minimum: 1920
Maximum: 114920
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 1920
5-th percentile: 4300
Q1: 39950
Median: 57700
Q3: 70425
95-th percentile: 90012
Maximum: 114920
Range: 113000
Interquartile range (IQR): 30475

Descriptive statistics

Standard deviation: 27072.205
Coefficient of variation (CV): 0.51888506
Kurtosis: -0.57864658
Mean: 52173.8
Median Absolute Deviation (MAD): 14830
Skewness: -0.52770402
Sum: 41739040
Variance: 7.3290431 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
58500 4
 
0.5%
3190 4
 
0.5%
70400 4
 
0.5%
59400 4
 
0.5%
44200 4
 
0.5%
4700 3
 
0.4%
5000 3
 
0.4%
64080 3
 
0.4%
64800 3
 
0.4%
5060 3
 
0.4%
Other values (629) 765
95.6%
ValueCountFrequency (%)
1920 1
 
0.1%
2160 1
 
0.1%
2250 1
 
0.1%
2400 1
 
0.1%
2520 1
 
0.1%
2640 3
0.4%
2700 1
 
0.1%
2800 1
 
0.1%
2860 1
 
0.1%
2970 1
 
0.1%
ValueCountFrequency (%)
114920 1
0.1%
112320 1
0.1%
108480 1
0.1%
108030 1
0.1%
107900 1
0.1%
105820 1
0.1%
105040 1
0.1%
104610 1
0.1%
101860 1
0.1%
101010 1
0.1%

injury_claim
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 552
Distinct (%): 69.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7373.5625
Minimum: 0
Maximum: 21450
Zeros: 18
Zeros (%): 2.2%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 460
Q1: 4065
Median: 6675
Q3: 11330
95-th percentile: 15786
Maximum: 21450
Range: 21450
Interquartile range (IQR): 7265

Descriptive statistics

Standard deviation: 4947.4594
Coefficient of variation (CV): 0.67097273
Kurtosis: -0.76629945
Mean: 7373.5625
Median Absolute Deviation (MAD): 3905
Skewness: 0.29576017
Sum: 5898850
Variance: 24477354
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 18
 
2.2%
640 6
 
0.8%
480 6
 
0.8%
1180 5
 
0.6%
860 5
 
0.6%
5540 5
 
0.6%
780 5
 
0.6%
660 5
 
0.6%
580 5
 
0.6%
6340 5
 
0.6%
Other values (542) 735
91.9%
ValueCountFrequency (%)
0 18
2.2%
220 1
 
0.1%
250 1
 
0.1%
280 2
 
0.2%
290 1
 
0.1%
300 2
 
0.2%
330 2
 
0.2%
360 1
 
0.1%
390 1
 
0.1%
400 2
 
0.2%
ValueCountFrequency (%)
21450 1
0.1%
21330 1
0.1%
20700 1
0.1%
19020 1
0.1%
18520 1
0.1%
18220 1
0.1%
18180 1
0.1%
18080 1
0.1%
18000 1
0.1%
17880 1
0.1%

property_claim
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 540
Distinct (%): 67.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7324.225
Minimum: 0
Maximum: 23670
Zeros: 16
Zeros (%): 2.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 449.5
Q1: 4270
Median: 6700
Q3: 10940
95-th percentile: 15561
Maximum: 23670
Range: 23670
Interquartile range (IQR): 6670

Descriptive statistics

Standard deviation: 4890.63
Coefficient of variation (CV): 0.66773344
Kurtosis: -0.42559132
Mean: 7324.225
Median Absolute Deviation (MAD): 3565
Skewness: 0.39144091
Sum: 5859380
Variance: 23918261
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 16
 
2.0%
11080 5
 
0.6%
10000 5
 
0.6%
860 5
 
0.6%
6340 4
 
0.5%
660 4
 
0.5%
4420 4
 
0.5%
480 4
 
0.5%
590 4
 
0.5%
780 4
 
0.5%
Other values (530) 745
93.1%
ValueCountFrequency (%)
0 16
2.0%
240 1
 
0.1%
250 1
 
0.1%
260 1
 
0.1%
280 3
 
0.4%
290 2
 
0.2%
300 2
 
0.2%
320 3
 
0.4%
330 1
 
0.1%
380 1
 
0.1%
ValueCountFrequency (%)
23670 1
0.1%
21810 1
0.1%
21580 1
0.1%
21240 1
0.1%
20550 1
0.1%
20310 1
0.1%
19650 1
0.1%
19470 1
0.1%
19260 1
0.1%
19200 1
0.1%

vehicle_claim
Real number (ℝ)

HIGH CORRELATION 

Distinct: 618
Distinct (%): 77.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37476.012
Minimum: 1440
Maximum: 79560
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 1440
5-th percentile: 2968.5
Q1: 28980
Median: 41760
Q3: 50762.5
95-th percentile: 63457.5
Maximum: 79560
Range: 78120
Interquartile range (IQR): 21782.5

Descriptive statistics

Standard deviation: 19309.208
Coefficient of variation (CV): 0.5152418
Kurtosis: -0.56224385
Mean: 37476.012
Median Absolute Deviation (MAD): 10065
Skewness: -0.56298867
Sum: 29980810
Variance: 3.7284551 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5040 6
 
0.8%
44800 5
 
0.6%
4720 5
 
0.6%
33600 5
 
0.6%
3360 5
 
0.6%
52080 4
 
0.5%
41760 4
 
0.5%
35000 4
 
0.5%
46160 3
 
0.4%
2730 3
 
0.4%
Other values (608) 756
94.5%
ValueCountFrequency (%)
1440 2
0.2%
1680 2
0.2%
1750 1
 
0.1%
1800 1
 
0.1%
1960 2
0.2%
1980 1
 
0.1%
2030 1
 
0.1%
2080 1
 
0.1%
2100 2
0.2%
2240 3
0.4%
ValueCountFrequency (%)
79560 1
0.1%
77760 1
0.1%
77670 1
0.1%
76400 1
0.1%
76000 1
0.1%
75600 1
0.1%
75530 1
0.1%
74790 1
0.1%
73620 1
0.1%
73260 1
0.1%

auto_year
Categorical

Distinct: 21
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 53.9 KiB
2003 48
2007 47
2005 45
1999 44
2006 42
Other values (16) 574

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 3200
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2002
2nd row: 1998
3rd row: 2004
4th row: 1996
5th row: 2002

Common Values

ValueCountFrequency (%)
2003 48
 
6.0%
2007 47
 
5.9%
2005 45
 
5.6%
1999 44
 
5.5%
2006 42
 
5.2%
1995 42
 
5.2%
2002 41
 
5.1%
2015 39
 
4.9%
2009 38
 
4.8%
2013 37
 
4.6%
Other values (11) 377
47.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2003 48
 
6.0%
2007 47
 
5.9%
2005 45
 
5.6%
1999 44
 
5.5%
2006 42
 
5.2%
1995 42
 
5.2%
2002 41
 
5.1%
2015 39
 
4.9%
2009 38
 
4.8%
2010 37
 
4.6%
Other values (11) 377
47.1%

Most occurring characters

ValueCountFrequency (%)
0 1087
34.0%
2 692
21.6%
1 474
14.8%
9 448
14.0%
5 126
 
3.9%
3 85
 
2.7%
7 76
 
2.4%
6 74
 
2.3%
8 70
 
2.2%
4 68
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3200
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1087
34.0%
2 692
21.6%
1 474
14.8%
9 448
14.0%
5 126
 
3.9%
3 85
 
2.7%
7 76
 
2.4%
6 74
 
2.3%
8 70
 
2.2%
4 68
 
2.1%

Most occurring scripts

ValueCountFrequency (%)
Common 3200
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1087
34.0%
2 692
21.6%
1 474
14.8%
9 448
14.0%
5 126
 
3.9%
3 85
 
2.7%
7 76
 
2.4%
6 74
 
2.3%
8 70
 
2.2%
4 68
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1087
34.0%
2 692
21.6%
1 474
14.8%
9 448
14.0%
5 126
 
3.9%
3 85
 
2.7%
7 76
 
2.4%
6 74
 
2.3%
8 70
 
2.2%
4 68
 
2.1%

policy_state
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 52.3 KiB
OH 282
IL 265
IN 253

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1600
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IN
2nd row: IN
3rd row: IN
4th row: OH
5th row: OH

Common Values

ValueCountFrequency (%)
OH 282
35.2%
IL 265
33.1%
IN 253
31.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
oh 282
35.2%
il 265
33.1%
in 253
31.6%

Most occurring characters

ValueCountFrequency (%)
I 518
32.4%
O 282
17.6%
H 282
17.6%
L 265
16.6%
N 253
15.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1600
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 518
32.4%
O 282
17.6%
H 282
17.6%
L 265
16.6%
N 253
15.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 1600
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 518
32.4%
O 282
17.6%
H 282
17.6%
L 265
16.6%
N 253
15.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1600
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 518
32.4%
O 282
17.6%
H 282
17.6%
L 265
16.6%
N 253
15.8%

policy_csl
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 56.5 KiB
250/500 286
100/300 274
500/1000 240

Length

Max length: 8
Median length: 7
Mean length: 7.3
Min length: 7

Characters and Unicode

Total characters: 5840
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 250/500
2nd row: 500/1000
3rd row: 100/300
4th row: 500/1000
5th row: 100/300

Common Values

ValueCountFrequency (%)
250/500 286
35.8%
100/300 274
34.2%
500/1000 240
30.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
250/500 286
35.8%
100/300 274
34.2%
500/1000 240
30.0%

Most occurring characters

ValueCountFrequency (%)
0 3154
54.0%
5 812
 
13.9%
/ 800
 
13.7%
1 514
 
8.8%
2 286
 
4.9%
3 274
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5040
86.3%
Other Punctuation 800
 
13.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3154
62.6%
5 812
 
16.1%
1 514
 
10.2%
2 286
 
5.7%
3 274
 
5.4%
Other Punctuation
ValueCountFrequency (%)
/ 800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5840
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3154
54.0%
5 812
 
13.9%
/ 800
 
13.7%
1 514
 
8.8%
2 286
 
4.9%
3 274
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5840
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3154
54.0%
5 812
 
13.9%
/ 800
 
13.7%
1 514
 
8.8%
2 286
 
4.9%
3 274
 
4.7%

insured_sex
Categorical

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size54.7 KiB
FEMALE
426 
MALE
374 

Length

Max length6
Median length6
Mean length5.065
Min length4

Characters and Unicode

Total characters4052
Distinct characters5
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowMALE
2nd rowFEMALE
3rd rowMALE
4th rowFEMALE
5th rowMALE

Common Values

ValueCountFrequency (%)
FEMALE 426
53.2%
MALE 374
46.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
female 426
53.2%
male 374
46.8%

Most occurring characters

ValueCountFrequency (%)
E 1226
30.3%
M 800
19.7%
A 800
19.7%
L 800
19.7%
F 426
 
10.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4052
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1226
30.3%
M 800
19.7%
A 800
19.7%
L 800
19.7%
F 426
 
10.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 4052
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1226
30.3%
M 800
19.7%
A 800
19.7%
L 800
19.7%
F 426
 
10.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4052
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1226
30.3%
M 800
19.7%
A 800
19.7%
L 800
19.7%
F 426
 
10.5%

insured_hobbies
Categorical

Distinct20
Distinct (%)2.5%
Missing0
Missing (%)0.0%
Memory size57.1 KiB
reading
 
54
paintball
 
52
kayaking
 
46
movies
 
46
golf
 
44
Other values (15)
558 

Length

Max length14
Median length11
Mean length8.095
Min length4

Characters and Unicode

Total characters6476
Distinct characters24
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowexercise
2nd rowcamping
3rd rowreading
4th rowbasketball
5th rowbasketball

Common Values

ValueCountFrequency (%)
reading 54
 
6.8%
paintball 52
 
6.5%
kayaking 46
 
5.8%
movies 46
 
5.8%
golf 44
 
5.5%
bungie-jumping 44
 
5.5%
skydiving 43
 
5.4%
exercise 42
 
5.2%
base-jumping 40
 
5.0%
yachting 39
 
4.9%
Other values (10) 350
43.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
reading 54
 
6.8%
paintball 52
 
6.5%
kayaking 46
 
5.8%
movies 46
 
5.8%
golf 44
 
5.5%
bungie-jumping 44
 
5.5%
skydiving 43
 
5.4%
exercise 42
 
5.2%
base-jumping 40
 
5.0%
hiking 39
 
4.9%
Other values (10) 350
43.8%

Most occurring characters

ValueCountFrequency (%)
i 747
 
11.5%
g 576
 
8.9%
a 565
 
8.7%
n 550
 
8.5%
e 548
 
8.5%
s 433
 
6.7%
l 267
 
4.1%
o 265
 
4.1%
p 244
 
3.8%
m 242
 
3.7%
Other values (14) 2039
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6291
97.1%
Dash Punctuation 185
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 747
11.9%
g 576
 
9.2%
a 565
 
9.0%
n 550
 
8.7%
e 548
 
8.7%
s 433
 
6.9%
l 267
 
4.2%
o 265
 
4.2%
p 244
 
3.9%
m 242
 
3.8%
Other values (13) 1854
29.5%
Dash Punctuation
ValueCountFrequency (%)
- 185
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6291
97.1%
Common 185
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 747
11.9%
g 576
 
9.2%
a 565
 
9.0%
n 550
 
8.7%
e 548
 
8.7%
s 433
 
6.9%
l 267
 
4.2%
o 265
 
4.2%
p 244
 
3.9%
m 242
 
3.8%
Other values (13) 1854
29.5%
Common
ValueCountFrequency (%)
- 185
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6476
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 747
 
11.5%
g 576
 
8.9%
a 565
 
8.7%
n 550
 
8.5%
e 548
 
8.5%
s 433
 
6.7%
l 267
 
4.1%
o 265
 
4.1%
p 244
 
3.8%
m 242
 
3.7%
Other values (14) 2039
31.5%

incident_type
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size67.3 KiB
Multi-vehicle Collision
334 
Single Vehicle Collision
314 
Parked Car
77 
Vehicle Theft
75 

Length

Max length24
Median length23
Mean length21.20375
Min length10

Characters and Unicode

Total characters16963
Distinct characters25
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowParked Car
2nd rowMulti-vehicle Collision
3rd rowSingle Vehicle Collision
4th rowMulti-vehicle Collision
5th rowSingle Vehicle Collision

Common Values

ValueCountFrequency (%)
Multi-vehicle Collision 334
41.8%
Single Vehicle Collision 314
39.2%
Parked Car 77
 
9.6%
Vehicle Theft 75
 
9.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
collision 648
33.9%
vehicle 389
20.3%
multi-vehicle 334
17.5%
single 314
16.4%
parked 77
 
4.0%
car 77
 
4.0%
theft 75
 
3.9%

Most occurring characters

ValueCountFrequency (%)
l 2667
15.7%
i 2667
15.7%
e 1912
11.3%
o 1296
 
7.6%
(space) 1114
 
6.6%
n 962
 
5.7%
h 798
 
4.7%
C 725
 
4.3%
c 723
 
4.3%
s 648
 
3.8%
Other values (15) 3451
20.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13601
80.2%
Uppercase Letter 1914
 
11.3%
Space Separator 1114
 
6.6%
Dash Punctuation 334
 
2.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2667
19.6%
i 2667
19.6%
e 1912
14.1%
o 1296
9.5%
n 962
 
7.1%
h 798
 
5.9%
c 723
 
5.3%
s 648
 
4.8%
t 409
 
3.0%
u 334
 
2.5%
Other values (7) 1185
8.7%
Uppercase Letter
ValueCountFrequency (%)
C 725
37.9%
V 389
20.3%
M 334
17.5%
S 314
16.4%
P 77
 
4.0%
T 75
 
3.9%
Space Separator
ValueCountFrequency (%)
(space) 1114
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 334
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15515
91.5%
Common 1448
 
8.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2667
17.2%
i 2667
17.2%
e 1912
12.3%
o 1296
8.4%
n 962
 
6.2%
h 798
 
5.1%
C 725
 
4.7%
c 723
 
4.7%
s 648
 
4.2%
t 409
 
2.6%
Other values (13) 2708
17.5%
Common
ValueCountFrequency (%)
(space) 1114
76.9%
- 334
 
23.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16963
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2667
15.7%
i 2667
15.7%
e 1912
11.3%
o 1296
 
7.6%
(space) 1114
 
6.6%
n 962
 
5.7%
h 798
 
4.7%
C 725
 
4.3%
c 723
 
4.3%
s 648
 
3.8%
Other values (15) 3451
20.3%

collision_type
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size60.9 KiB
Rear Collision
232 
Side Collision
212 
Front Collision
204 
UNKNOWN
152 

Length

Max length15
Median length14
Mean length12.925
Min length7

Characters and Unicode

Total characters10340
Distinct characters20
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowUNKNOWN
2nd rowSide Collision
3rd rowRear Collision
4th rowSide Collision
5th rowRear Collision

Common Values

ValueCountFrequency (%)
Rear Collision 232
29.0%
Side Collision 212
26.5%
Front Collision 204
25.5%
UNKNOWN 152
19.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
collision 648
44.8%
rear 232
 
16.0%
side 212
 
14.6%
front 204
 
14.1%
unknown 152
 
10.5%

Most occurring characters

ValueCountFrequency (%)
i 1508
14.6%
o 1500
14.5%
l 1296
12.5%
n 852
8.2%
(space) 648
 
6.3%
C 648
 
6.3%
s 648
 
6.3%
N 456
 
4.4%
e 444
 
4.3%
r 436
 
4.2%
Other values (10) 1904
18.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7332
70.9%
Uppercase Letter 2360
 
22.8%
Space Separator 648
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1508
20.6%
o 1500
20.5%
l 1296
17.7%
n 852
11.6%
s 648
8.8%
e 444
 
6.1%
r 436
 
5.9%
a 232
 
3.2%
d 212
 
2.9%
t 204
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
C 648
27.5%
N 456
19.3%
R 232
 
9.8%
S 212
 
9.0%
F 204
 
8.6%
U 152
 
6.4%
K 152
 
6.4%
O 152
 
6.4%
W 152
 
6.4%
Space Separator
ValueCountFrequency (%)
(space) 648
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9692
93.7%
Common 648
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1508
15.6%
o 1500
15.5%
l 1296
13.4%
n 852
8.8%
C 648
 
6.7%
s 648
 
6.7%
N 456
 
4.7%
e 444
 
4.6%
r 436
 
4.5%
R 232
 
2.4%
Other values (9) 1672
17.3%
Common
ValueCountFrequency (%)
(space) 648
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1508
14.6%
o 1500
14.5%
l 1296
12.5%
n 852
8.2%
(space) 648
 
6.3%
C 648
 
6.3%
s 648
 
6.3%
N 456
 
4.4%
e 444
 
4.3%
r 436
 
4.2%
Other values (10) 1904
18.4%
Distinct4
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size59.9 KiB
Minor Damage
281 
Total Loss
225 
Major Damage
219 
Trivial Damage
75 

Length

Max length14
Median length12
Mean length11.625
Min length10

Characters and Unicode

Total characters9300
Distinct characters18
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowMinor Damage
2nd rowTotal Loss
3rd rowMajor Damage
4th rowMajor Damage
5th rowTotal Loss

Common Values

ValueCountFrequency (%)
Minor Damage 281
35.1%
Total Loss 225
28.1%
Major Damage 219
27.4%
Trivial Damage 75
 
9.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
damage 575
35.9%
minor 281
17.6%
total 225
 
14.1%
loss 225
 
14.1%
major 219
 
13.7%
trivial 75
 
4.7%

Most occurring characters

ValueCountFrequency (%)
a 1669
17.9%
o 950
10.2%
(space) 800
 
8.6%
g 575
 
6.2%
m 575
 
6.2%
e 575
 
6.2%
r 575
 
6.2%
D 575
 
6.2%
M 500
 
5.4%
s 450
 
4.8%
Other values (8) 2056
22.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6900
74.2%
Uppercase Letter 1600
 
17.2%
Space Separator 800
 
8.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1669
24.2%
o 950
13.8%
g 575
 
8.3%
m 575
 
8.3%
e 575
 
8.3%
r 575
 
8.3%
s 450
 
6.5%
i 431
 
6.2%
l 300
 
4.3%
n 281
 
4.1%
Other values (3) 519
 
7.5%
Uppercase Letter
ValueCountFrequency (%)
D 575
35.9%
M 500
31.2%
T 300
18.8%
L 225
 
14.1%
Space Separator
ValueCountFrequency (%)
(space) 800
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8500
91.4%
Common 800
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1669
19.6%
o 950
11.2%
g 575
 
6.8%
m 575
 
6.8%
e 575
 
6.8%
r 575
 
6.8%
D 575
 
6.8%
M 500
 
5.9%
s 450
 
5.3%
i 431
 
5.1%
Other values (7) 1625
19.1%
Common
ValueCountFrequency (%)
(space) 800
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9300
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1669
17.9%
o 950
10.2%
(space) 800
 
8.6%
g 575
 
6.2%
m 575
 
6.2%
e 575
 
6.2%
r 575
 
6.2%
D 575
 
6.2%
M 500
 
5.4%
s 450
 
4.8%
Other values (8) 2056
22.1%
Distinct5
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size55.3 KiB
Police
235 
Fire
173 
Other
156 
Ambulance
154 
None
82 

Length

Max length9
Median length6
Mean length5.745
Min length4

Characters and Unicode

Total characters4596
Distinct characters18
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNone
2nd rowAmbulance
3rd rowAmbulance
4th rowPolice
5th rowOther

Common Values

ValueCountFrequency (%)
Police 235
29.4%
Fire 173
21.6%
Other 156
19.5%
Ambulance 154
19.2%
None 82
 
10.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
police 235
29.4%
fire 173
21.6%
other 156
19.5%
ambulance 154
19.2%
none 82
 
10.2%

Most occurring characters

ValueCountFrequency (%)
e 800
17.4%
i 408
 
8.9%
l 389
 
8.5%
c 389
 
8.5%
r 329
 
7.2%
o 317
 
6.9%
n 236
 
5.1%
P 235
 
5.1%
F 173
 
3.8%
h 156
 
3.4%
Other values (8) 1164
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3796
82.6%
Uppercase Letter 800
 
17.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 800
21.1%
i 408
10.7%
l 389
10.2%
c 389
10.2%
r 329
8.7%
o 317
 
8.4%
n 236
 
6.2%
h 156
 
4.1%
t 156
 
4.1%
m 154
 
4.1%
Other values (3) 462
12.2%
Uppercase Letter
ValueCountFrequency (%)
P 235
29.4%
F 173
21.6%
O 156
19.5%
A 154
19.2%
N 82
 
10.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 4596
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 800
17.4%
i 408
 
8.9%
l 389
 
8.5%
c 389
 
8.5%
r 329
 
7.2%
o 317
 
6.9%
n 236
 
5.1%
P 235
 
5.1%
F 173
 
3.8%
h 156
 
3.4%
Other values (8) 1164
25.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4596
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 800
17.4%
i 408
 
8.9%
l 389
 
8.5%
c 389
 
8.5%
r 329
 
7.2%
o 317
 
6.9%
n 236
 
5.1%
P 235
 
5.1%
F 173
 
3.8%
h 156
 
3.4%
Other values (8) 1164
25.3%

incident_state
Categorical

Distinct7
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Memory size52.3 KiB
NY
205 
SC
194 
WV
174 
NC
94 
VA
87 
Other values (2)
46 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1600
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowVA
2nd rowVA
3rd rowNY
4th rowWV
5th rowWV

Common Values

ValueCountFrequency (%)
NY 205
25.6%
SC 194
24.2%
WV 174
21.8%
NC 94
11.8%
VA 87
10.9%
PA 28
 
3.5%
OH 18
 
2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ny 205
25.6%
sc 194
24.2%
wv 174
21.8%
nc 94
11.8%
va 87
10.9%
pa 28
 
3.5%
oh 18
 
2.2%

Most occurring characters

ValueCountFrequency (%)
N 299
18.7%
C 288
18.0%
V 261
16.3%
Y 205
12.8%
S 194
12.1%
W 174
10.9%
A 115
 
7.2%
P 28
 
1.8%
O 18
 
1.1%
H 18
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1600
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 299
18.7%
C 288
18.0%
V 261
16.3%
Y 205
12.8%
S 194
12.1%
W 174
10.9%
A 115
 
7.2%
P 28
 
1.8%
O 18
 
1.1%
H 18
 
1.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1600
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 299
18.7%
C 288
18.0%
V 261
16.3%
Y 205
12.8%
S 194
12.1%
W 174
10.9%
A 115
 
7.2%
P 28
 
1.8%
O 18
 
1.1%
H 18
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1600
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 299
18.7%
C 288
18.0%
V 261
16.3%
Y 205
12.8%
S 194
12.1%
W 174
10.9%
A 115
 
7.2%
P 28
 
1.8%
O 18
 
1.1%
H 18
 
1.1%

incident_city
Categorical

Distinct7
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Memory size58.0 KiB
Springfield
127 
Arlington
122 
Columbus
119 
Hillsdale
118 
Northbend
113 
Other values (2)
201 

Length

Max length11
Median length9
Mean length9.2825
Min length8

Characters and Unicode

Total characters7426
Distinct characters26
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowArlington
2nd rowNorthbend
3rd rowHillsdale
4th rowNorthbrook
5th rowRiverwood

Common Values

ValueCountFrequency (%)
Springfield 127
15.9%
Arlington 122
15.2%
Columbus 119
14.9%
Hillsdale 118
14.8%
Northbend 113
14.1%
Riverwood 110
13.8%
Northbrook 91
11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
springfield 127
15.9%
arlington 122
15.2%
columbus 119
14.9%
hillsdale 118
14.8%
northbend 113
14.1%
riverwood 110
13.8%
northbrook 91
11.4%

Most occurring characters

ValueCountFrequency (%)
o 847
 
11.4%
l 722
 
9.7%
r 654
 
8.8%
i 604
 
8.1%
n 484
 
6.5%
d 468
 
6.3%
e 468
 
6.3%
t 326
 
4.4%
b 323
 
4.3%
g 249
 
3.4%
Other values (16) 2281
30.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6626
89.2%
Uppercase Letter 800
 
10.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 847
12.8%
l 722
10.9%
r 654
9.9%
i 604
9.1%
n 484
 
7.3%
d 468
 
7.1%
e 468
 
7.1%
t 326
 
4.9%
b 323
 
4.9%
g 249
 
3.8%
Other values (10) 1481
22.4%
Uppercase Letter
ValueCountFrequency (%)
N 204
25.5%
S 127
15.9%
A 122
15.2%
C 119
14.9%
H 118
14.8%
R 110
13.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 7426
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 847
 
11.4%
l 722
 
9.7%
r 654
 
8.8%
i 604
 
8.1%
n 484
 
6.5%
d 468
 
6.3%
e 468
 
6.3%
t 326
 
4.4%
b 323
 
4.3%
g 249
 
3.4%
Other values (16) 2281
30.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7426
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 847
 
11.4%
l 722
 
9.7%
r 654
 
8.8%
i 604
 
8.1%
n 484
 
6.5%
d 468
 
6.3%
e 468
 
6.3%
t 326
 
4.4%
b 323
 
4.3%
g 249
 
3.4%
Other values (16) 2281
30.7%

property_damage
Categorical

Distinct3
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size54.0 KiB
UNKNOWN
286 
NO
282 
YES
232 

Length

Max length7
Median length3
Mean length4.0775
Min length2

Characters and Unicode

Total characters3262
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNO
2nd rowYES
3rd rowUNKNOWN
4th rowUNKNOWN
5th rowNO

Common Values

ValueCountFrequency (%)
UNKNOWN 286
35.8%
NO 282
35.2%
YES 232
29.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
unknown 286
35.8%
no 282
35.2%
yes 232
29.0%

Most occurring characters

ValueCountFrequency (%)
N 1140
34.9%
O 568
17.4%
U 286
 
8.8%
K 286
 
8.8%
W 286
 
8.8%
Y 232
 
7.1%
E 232
 
7.1%
S 232
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3262
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1140
34.9%
O 568
17.4%
U 286
 
8.8%
K 286
 
8.8%
W 286
 
8.8%
Y 232
 
7.1%
E 232
 
7.1%
S 232
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3262
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1140
34.9%
O 568
17.4%
U 286
 
8.8%
K 286
 
8.8%
W 286
 
8.8%
Y 232
 
7.1%
E 232
 
7.1%
S 232
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1140
34.9%
O 568
17.4%
U 286
 
8.8%
K 286
 
8.8%
W 286
 
8.8%
Y 232
 
7.1%
E 232
 
7.1%
S 232
 
7.1%
Distinct3
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size53.9 KiB
NO
278 
UNKNOWN
268 
YES
254 

Length

Max length7
Median length3
Mean length3.9925
Min length2

Characters and Unicode

Total characters3194
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNO
2nd rowUNKNOWN
3rd rowUNKNOWN
4th rowUNKNOWN
5th rowNO

Common Values

ValueCountFrequency (%)
NO 278
34.8%
UNKNOWN 268
33.5%
YES 254
31.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 278
34.8%
unknown 268
33.5%
yes 254
31.8%

Most occurring characters

ValueCountFrequency (%)
N 1082
33.9%
O 546
17.1%
U 268
 
8.4%
K 268
 
8.4%
W 268
 
8.4%
Y 254
 
8.0%
E 254
 
8.0%
S 254
 
8.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3194
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1082
33.9%
O 546
17.1%
U 268
 
8.4%
K 268
 
8.4%
W 268
 
8.4%
Y 254
 
8.0%
E 254
 
8.0%
S 254
 
8.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3194
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1082
33.9%
O 546
17.1%
U 268
 
8.4%
K 268
 
8.4%
W 268
 
8.4%
Y 254
 
8.0%
E 254
 
8.0%
S 254
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3194
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1082
33.9%
O 546
17.1%
U 268
 
8.4%
K 268
 
8.4%
W 268
 
8.4%
Y 254
 
8.0%
E 254
 
8.0%
S 254
 
8.0%

auto_make
Categorical

HIGH CORRELATION 

Distinct14
Distinct (%)1.8%
Missing0
Missing (%)0.0%
Memory size55.2 KiB
Suburu
70 
Saab
66 
Chevrolet
59 
Ford
59 
BMW
59 
Other values (9)
487 

Length

Max length10
Median length9
Mean length5.69875
Min length3

Characters and Unicode

Total characters4559
Distinct characters33
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowHonda
2nd rowAccura
3rd rowMercedes
4th rowChevrolet
5th rowAccura

Common Values

ValueCountFrequency (%)
Suburu 70
 
8.8%
Saab 66
 
8.2%
Chevrolet 59
 
7.4%
Ford 59
 
7.4%
BMW 59
 
7.4%
Accura 58
 
7.2%
Audi 57
 
7.1%
Dodge 57
 
7.1%
Volkswagen 56
 
7.0%
Nissan 56
 
7.0%
Other values (4) 203
25.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
suburu 70
 
8.8%
saab 66
 
8.2%
chevrolet 59
 
7.4%
ford 59
 
7.4%
bmw 59
 
7.4%
accura 58
 
7.2%
audi 57
 
7.1%
dodge 57
 
7.1%
volkswagen 56
 
7.0%
nissan 56
 
7.0%
Other values (4) 203
25.4%

Most occurring characters

ValueCountFrequency (%)
e 496
 
10.9%
a 399
 
8.8%
o 381
 
8.4%
u 325
 
7.1%
r 299
 
6.6%
d 270
 
5.9%
s 221
 
4.8%
c 169
 
3.7%
n 156
 
3.4%
S 136
 
3.0%
Other values (23) 1707
37.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3641
79.9%
Uppercase Letter 918
 
20.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 496
13.6%
a 399
11.0%
o 381
10.5%
u 325
8.9%
r 299
 
8.2%
d 270
 
7.4%
s 221
 
6.1%
c 169
 
4.6%
n 156
 
4.3%
b 136
 
3.7%
Other values (10) 789
21.7%
Uppercase Letter
ValueCountFrequency (%)
S 136
14.8%
A 115
12.5%
M 112
12.2%
W 59
 
6.4%
B 59
 
6.4%
F 59
 
6.4%
C 59
 
6.4%
D 57
 
6.2%
V 56
 
6.1%
N 56
 
6.1%
Other values (3) 150
16.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 4559
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 496
 
10.9%
a 399
 
8.8%
o 381
 
8.4%
u 325
 
7.1%
r 299
 
6.6%
d 270
 
5.9%
s 221
 
4.8%
c 169
 
3.7%
n 156
 
3.4%
S 136
 
3.0%
Other values (23) 1707
37.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4559
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 496
 
10.9%
a 399
 
8.8%
o 381
 
8.4%
u 325
 
7.1%
r 299
 
6.6%
d 270
 
5.9%
s 221
 
4.8%
c 169
 
3.7%
n 156
 
3.4%
S 136
 
3.0%
Other values (23) 1707
37.4%

auto_model
Categorical

HIGH CORRELATION 

Distinct39
Distinct (%)4.9%
Missing0
Missing (%)0.0%
Memory size54.8 KiB
Wrangler
 
33
A3
 
32
Jetta
 
32
MDX
 
31
RAM
 
31
Other values (34)
641 

Length

Max length14
Median length9
Mean length5.12
Min length2

Characters and Unicode

Total characters4096
Distinct characters52
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowCivic
2nd rowMDX
3rd rowE400
4th rowSilverado
5th rowTL

Common Values

ValueCountFrequency (%)
Wrangler 33
 
4.1%
A3 32
 
4.0%
Jetta 32
 
4.0%
MDX 31
 
3.9%
RAM 31
 
3.9%
Legacy 30
 
3.8%
Neon 26
 
3.2%
Forrestor 26
 
3.2%
92x 25
 
3.1%
A5 25
 
3.1%
Other values (29) 509
63.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
wrangler 33
 
4.0%
a3 32
 
3.8%
jetta 32
 
3.8%
mdx 31
 
3.7%
ram 31
 
3.7%
legacy 30
 
3.6%
neon 26
 
3.1%
forrestor 26
 
3.1%
a5 25
 
3.0%
92x 25
 
3.0%
Other values (31) 543
65.1%

Most occurring characters

ValueCountFrequency (%)
a 381
 
9.3%
e 339
 
8.3%
r 309
 
7.5%
o 191
 
4.7%
i 180
 
4.4%
t 153
 
3.7%
l 139
 
3.4%
n 133
 
3.2%
M 133
 
3.2%
s 125
 
3.1%
Other values (42) 2013
49.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2624
64.1%
Uppercase Letter 965
 
23.6%
Decimal Number 473
 
11.5%
Space Separator 34
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 381
14.5%
e 339
12.9%
r 309
11.8%
o 191
 
7.3%
i 180
 
6.9%
t 153
 
5.8%
l 139
 
5.3%
n 133
 
5.1%
s 125
 
4.8%
c 87
 
3.3%
Other values (13) 587
22.4%
Uppercase Letter
ValueCountFrequency (%)
M 133
13.8%
C 103
10.7%
A 98
 
10.2%
X 74
 
7.7%
F 66
 
6.8%
L 63
 
6.5%
R 58
 
6.0%
P 43
 
4.5%
E 42
 
4.4%
S 40
 
4.1%
Other values (10) 245
25.4%
Decimal Number
ValueCountFrequency (%)
5 116
24.5%
0 111
23.5%
3 96
20.3%
9 66
14.0%
2 25
 
5.3%
4 23
 
4.9%
1 22
 
4.7%
6 14
 
3.0%
Space Separator
ValueCountFrequency (%)
(space) 34
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3589
87.6%
Common 507
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 381
 
10.6%
e 339
 
9.4%
r 309
 
8.6%
o 191
 
5.3%
i 180
 
5.0%
t 153
 
4.3%
l 139
 
3.9%
n 133
 
3.7%
M 133
 
3.7%
s 125
 
3.5%
Other values (33) 1506
42.0%
Common
ValueCountFrequency (%)
5 116
22.9%
0 111
21.9%
3 96
18.9%
9 66
13.0%
(space) 34
 
6.7%
2 25
 
4.9%
4 23
 
4.5%
1 22
 
4.3%
6 14
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 381
 
9.3%
e 339
 
8.3%
r 309
 
7.5%
o 191
 
4.7%
i 180
 
4.4%
t 153
 
3.7%
l 139
 
3.4%
n 133
 
3.2%
M 133
 
3.2%
s 125
 
3.1%
Other values (42) 2013
49.1%
Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size7.0 KiB
False
602 
True
198 
ValueCountFrequency (%)
False 602
75.2%
True 198
 
24.8%

Interactions

2023-08-25T07:13:11.780401image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:13.202608image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:14.596265image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:16.133434image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:17.866572image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:19.331385image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:20.867380image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:04.391470image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:05.775563image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:07.291152image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:08.854846image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:10.415313image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:11.898410image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:13.313503image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:14.707095image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:16.271230image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:17.986709image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:19.445601image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:20.990355image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:04.500128image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:05.888253image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:07.391856image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:08.973507image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:10.551564image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:12.023104image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:13.433118image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:14.806698image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:16.419965image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:18.099705image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:19.562060image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:21.097908image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:04.615492image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:06.010953image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:07.502697image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:09.120384image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:10.679688image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:12.158209image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:13.548110image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:14.913616image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:16.609779image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:18.208309image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:19.696365image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:21.205058image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:04.727338image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:06.119475image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:07.607866image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:09.239952image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:10.800415image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:12.285300image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:13.660485image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:15.017139image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:16.752706image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:18.310415image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:19.804677image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:21.313263image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:04.890136image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:06.230618image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:07.721187image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:09.371017image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:10.927900image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:12.396926image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:13.760317image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:15.123465image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:16.866693image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:18.421914image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-08-25T07:13:19.916530image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/

Correlations

months_as_customer age policy_number policy_annual_premium insured_zip capital-gains capital-loss incident_hour_of_the_day total_claim_amount injury_claim property_claim vehicle_claim policy_deductable umbrella_limit number_of_vehicles_involved bodily_injuries witnesses auto_year policy_state policy_csl insured_sex insured_hobbies incident_type collision_type incident_severity authorities_contacted incident_state incident_city property_damage police_report_available auto_make auto_model fraud_reported
months_as_customer 1.000 0.912 0.033 0.000 0.001 -0.010 0.004 0.101 0.045 0.062 0.011 0.044 0.000 0.000 0.000 0.058 0.000 0.000 0.046 0.000 0.110 0.042 0.000 0.000 0.049 0.000 0.000 0.038 0.000 0.077 0.000 0.000 0.000
age 0.912 1.000 0.046 0.011 -0.008 -0.036 -0.017 0.114 0.063 0.077 0.046 0.052 0.000 0.000 0.104 0.000 0.019 0.036 0.023 0.022 0.062 0.000 0.000 0.000 0.041 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
policy_number 0.033 0.046 1.000 0.011 0.014 -0.016 -0.025 0.013 -0.028 -0.023 -0.018 -0.033 0.000 0.038 0.000 0.000 0.043 0.000 0.056 0.012 0.000 0.027 0.013 0.050 0.000 0.024 0.000 0.033 0.067 0.028 0.042 0.030 0.000
policy_annual_premium 0.000 0.011 0.011 1.000 0.057 -0.024 0.037 -0.018 -0.000 -0.022 0.005 0.013 0.065 0.000 0.000 0.000 0.027 0.000 0.000 0.088 0.159 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.060 0.046 0.000 0.000 0.041
insured_zip 0.001 -0.008 0.014 0.057 1.000 -0.032 0.055 -0.009 0.023 -0.003 0.023 0.007 0.000 0.088 0.031 0.000 0.000 0.052 0.029 0.000 0.000 0.025 0.024 0.000 0.000 0.000 0.027 0.034 0.026 0.081 0.000 0.000 0.069
capital-gains -0.010 -0.036 -0.016 -0.024 -0.032 1.000 -0.065 -0.037 0.007 0.019 0.002 0.002 0.000 0.000 0.090 0.076 0.039 0.063 0.000 0.080 0.000 0.047 0.027 0.000 0.016 0.000 0.000 0.000 0.000 0.000 0.024 0.000 0.000
capital-loss 0.004 -0.017 -0.025 0.037 0.055 -0.065 1.000 -0.017 -0.046 -0.054 -0.019 -0.042 0.029 0.029 0.000 0.000 0.000 0.016 0.058 0.000 0.089 0.036 0.000 0.051 0.004 0.027 0.000 0.050 0.000 0.000 0.006 0.000 0.000
incident_hour_of_the_day 0.101 0.114 0.013 -0.018 -0.009 -0.037 -0.017 1.000 0.197 0.175 0.188 0.194 0.085 0.014 0.118 0.000 0.027 0.042 0.000 0.000 0.000 0.000 0.269 0.284 0.196 0.176 0.000 0.000 0.000 0.023 0.000 0.000 0.081
total_claim_amount 0.045 0.063 -0.028 -0.000 0.023 0.007 -0.046 0.197 1.000 0.812 0.816 0.969 0.000 0.025 0.244 0.055 0.000 0.052 0.000 0.000 0.057 0.031 0.576 0.575 0.431 0.388 0.027 0.000 0.000 0.066 0.000 0.026 0.150
injury_claim 0.062 0.077 -0.023 -0.022 -0.003 0.019 -0.054 0.175 0.812 1.000 0.598 0.718 0.056 0.000 0.216 0.076 0.000 0.000 0.040 0.000 0.000 0.034 0.540 0.539 0.395 0.371 0.000 0.000 0.000 0.037 0.047 0.048 0.106
property_claim 0.011 0.046 -0.018 0.005 0.023 0.002 -0.019 0.188 0.816 0.598 1.000 0.720 0.036 0.046 0.215 0.061 0.000 0.037 0.000 0.037 0.020 0.000 0.539 0.543 0.399 0.365 0.000 0.014 0.029 0.000 0.000 0.071 0.174
vehicle_claim 0.044 0.052 -0.033 0.013 0.007 0.002 -0.042 0.194 0.969 0.718 0.720 1.000 0.035 0.038 0.239 0.062 0.000 0.028 0.000 0.038 0.000 0.020 0.576 0.576 0.431 0.383 0.053 0.000 0.000 0.090 0.000 0.000 0.159
policy_deductable 0.000 0.000 0.000 0.065 0.000 0.000 0.029 0.085 0.000 0.056 0.036 0.035 1.000 0.000 0.072 0.043 0.020 0.032 0.000 0.000 0.000 0.000 0.048 0.000 0.000 0.000 0.056 0.000 0.000 0.000 0.000 0.000 0.000
umbrella_limit 0.000 0.000 0.038 0.000 0.088 0.000 0.029 0.014 0.025 0.000 0.046 0.038 0.000 1.000 0.097 0.087 0.000 0.025 0.060 0.000 0.000 0.022 0.000 0.000 0.045 0.017 0.097 0.056 0.000 0.057 0.000 0.000 0.052
number_of_vehicles_involved 0.000 0.104 0.000 0.000 0.031 0.090 0.000 0.118 0.244 0.216 0.215 0.239 0.072 0.097 1.000 0.000 0.000 0.056 0.000 0.000 0.028 0.000 0.575 0.236 0.181 0.180 0.000 0.032 0.000 0.000 0.000 0.071 0.000
bodily_injuries 0.058 0.000 0.000 0.000 0.000 0.076 0.000 0.000 0.055 0.076 0.061 0.062 0.043 0.087 0.000 1.000 0.000 0.097 0.061 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.011 0.000 0.000 0.000 0.000 0.042
witnesses 0.000 0.019 0.043 0.027 0.000 0.039 0.000 0.027 0.000 0.000 0.000 0.000 0.020 0.000 0.000 0.000 1.000 0.074 0.000 0.023 0.020 0.055 0.000 0.055 0.044 0.007 0.033 0.042 0.025 0.000 0.000 0.000 0.061
auto_year 0.000 0.036 0.000 0.000 0.052 0.063 0.016 0.042 0.052 0.000 0.037 0.028 0.032 0.025 0.056 0.097 0.074 1.000 0.000 0.000 0.083 0.000 0.039 0.000 0.000 0.055 0.054 0.026 0.034 0.000 0.000 0.000 0.042
policy_state 0.046 0.023 0.056 0.000 0.029 0.000 0.058 0.000 0.000 0.040 0.000 0.000 0.000 0.060 0.000 0.061 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.026 0.000 0.029 0.000 0.000 0.044 0.045 0.082 0.088 0.000
policy_csl 0.000 0.022 0.012 0.088 0.000 0.080 0.000 0.000 0.000 0.000 0.037 0.038 0.000 0.000 0.000 0.000 0.023 0.000 0.000 1.000 0.000 0.000 0.021 0.062 0.000 0.039 0.000 0.000 0.000 0.061 0.000 0.053 0.000
insured_sex 0.110 0.062 0.000 0.159 0.000 0.000 0.089 0.000 0.057 0.000 0.020 0.000 0.000 0.000 0.028 0.000 0.020 0.083 0.000 0.000 1.000 0.081 0.012 0.000 0.000 0.064 0.062 0.036 0.020 0.026 0.000 0.085 0.000
insured_hobbies 0.042 0.000 0.027 0.000 0.025 0.047 0.036 0.000 0.031 0.034 0.000 0.020 0.000 0.022 0.000 0.000 0.055 0.000 0.000 0.000 0.081 1.000 0.044 0.068 0.000 0.000 0.052 0.025 0.000 0.107 0.029 0.038 0.436
incident_type 0.000 0.000 0.013 0.000 0.024 0.027 0.000 0.269 0.576 0.540 0.539 0.576 0.048 0.000 0.575 0.000 0.000 0.039 0.000 0.021 0.012 0.044 1.000 0.576 0.424 0.448 0.063 0.031 0.000 0.000 0.000 0.098 0.158
collision_type 0.000 0.000 0.050 0.000 0.000 0.000 0.051 0.284 0.575 0.539 0.543 0.576 0.000 0.000 0.236 0.000 0.055 0.000 0.026 0.062 0.000 0.068 0.576 1.000 0.424 0.445 0.070 0.000 0.000 0.000 0.000 0.000 0.157
incident_severity 0.049 0.041 0.000 0.000 0.000 0.016 0.004 0.196 0.431 0.395 0.399 0.431 0.000 0.045 0.181 0.000 0.044 0.000 0.000 0.000 0.000 0.000 0.424 0.424 1.000 0.313 0.041 0.000 0.071 0.000 0.000 0.000 0.498
authorities_contacted 0.000 0.000 0.024 0.000 0.000 0.000 0.027 0.176 0.388 0.371 0.365 0.383 0.000 0.017 0.180 0.000 0.007 0.055 0.029 0.039 0.064 0.000 0.448 0.445 0.313 1.000 0.010 0.000 0.000 0.000 0.000 0.000 0.153
incident_state 0.000 0.000 0.000 0.000 0.027 0.000 0.000 0.000 0.027 0.000 0.000 0.053 0.056 0.097 0.000 0.000 0.033 0.054 0.000 0.000 0.062 0.052 0.063 0.070 0.041 0.010 1.000 0.000 0.000 0.064 0.010 0.045 0.135
incident_city 0.038 0.000 0.033 0.000 0.034 0.000 0.050 0.000 0.000 0.000 0.014 0.000 0.000 0.056 0.032 0.011 0.042 0.026 0.000 0.000 0.036 0.025 0.031 0.000 0.000 0.000 0.000 1.000 0.084 0.048 0.014 0.093 0.000
property_damage 0.000 0.000 0.067 0.060 0.026 0.000 0.000 0.000 0.000 0.000 0.029 0.000 0.000 0.000 0.000 0.000 0.025 0.034 0.044 0.000 0.020 0.000 0.000 0.000 0.071 0.000 0.000 0.084 1.000 0.000 0.000 0.060 0.084
police_report_available 0.077 0.000 0.028 0.046 0.081 0.000 0.000 0.023 0.066 0.037 0.000 0.090 0.000 0.057 0.000 0.000 0.000 0.000 0.045 0.061 0.026 0.107 0.000 0.000 0.000 0.000 0.064 0.048 0.000 1.000 0.000 0.000 0.000
auto_make 0.000 0.000 0.042 0.000 0.000 0.024 0.006 0.000 0.000 0.047 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.082 0.000 0.000 0.029 0.000 0.000 0.000 0.000 0.010 0.014 0.000 0.000 1.000 0.984 0.000
auto_model 0.000 0.000 0.030 0.000 0.000 0.000 0.000 0.000 0.026 0.048 0.071 0.000 0.000 0.000 0.071 0.000 0.000 0.000 0.088 0.053 0.085 0.038 0.098 0.000 0.000 0.000 0.045 0.093 0.060 0.000 0.984 1.000 0.113
fraud_reported 0.000 0.000 0.000 0.041 0.069 0.000 0.000 0.081 0.150 0.106 0.174 0.159 0.000 0.052 0.000 0.042 0.061 0.042 0.000 0.000 0.000 0.436 0.158 0.157 0.498 0.153 0.135 0.000 0.084 0.000 0.000 0.113 1.000
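
As a reading aid, the sketch below lists the strongest pairs from a small excerpt of the matrix. The coefficients are copied from the table; the 0.5 cutoff and the helper name strong_pairs are assumptions made for illustration, not settings taken from the report configuration.

```python
# Excerpt of the correlation matrix (values copied from the table above).
corr = {
    ("months_as_customer", "age"): 0.912,
    ("total_claim_amount", "vehicle_claim"): 0.969,
    ("total_claim_amount", "injury_claim"): 0.812,
    ("total_claim_amount", "property_claim"): 0.816,
    ("auto_make", "auto_model"): 0.984,
    ("incident_severity", "fraud_reported"): 0.498,
    ("insured_hobbies", "fraud_reported"): 0.436,
    ("policy_number", "fraud_reported"): 0.000,
}


def strong_pairs(pairs, threshold=0.5):
    """Return (a, b, value) tuples with |value| >= threshold, strongest first.

    The default threshold of 0.5 is an assumed cutoff, not a documented one.
    """
    hits = [(a, b, v) for (a, b), v in pairs.items() if abs(v) >= threshold]
    return sorted(hits, key=lambda t: abs(t[2]), reverse=True)


for a, b, v in strong_pairs(corr):
    print(f"{a} ~ {b}: {v:.3f}")
```

Under this assumed cutoff, the two strongest associations with fraud_reported in the excerpt (incident_severity at 0.498 and insured_hobbies at 0.436) fall just below the line, which would be consistent with fraud_reported not appearing among the high-correlation alerts.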

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
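
A per-column nullity count can be sketched in plain Python as follows. The three records reuse property_damage and police_report_available values that appear in the sample rows of this report. The helper name nullity_by_column, and the choice to optionally treat the string "UNKNOWN" as missing, are assumptions for illustration; the report itself counts 0 missing cells.

```python
def nullity_by_column(rows, missing=(None,)):
    """Count values per column that are considered missing.

    `missing` lists the sentinel values treated as null. Including
    "UNKNOWN" there is an assumption, not something the report does.
    """
    counts = {}
    for row in rows:
        for col, val in row.items():
            counts[col] = counts.get(col, 0) + (val in missing)
    return counts


records = [
    {"property_damage": "NO", "police_report_available": "NO"},
    {"property_damage": "YES", "police_report_available": "UNKNOWN"},
    {"property_damage": "UNKNOWN", "police_report_available": "UNKNOWN"},
]

print(nullity_by_column(records))
print(nullity_by_column(records, missing=(None, "UNKNOWN")))
```

With only None treated as missing, every column reports zero, matching the report. Whether "UNKNOWN" should count as missing is a modelling decision that the profile does not settle.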

Sample

months_as_customer age policy_number policy_deductable policy_annual_premium umbrella_limit insured_zip capital-gains capital-loss incident_hour_of_the_day number_of_vehicles_involved bodily_injuries witnesses total_claim_amount injury_claim property_claim vehicle_claim auto_year policy_state policy_csl insured_sex insured_hobbies incident_type collision_type incident_severity authorities_contacted incident_state incident_city property_damage police_report_available auto_make auto_model fraud_reported
887441556695015001270400000044942124000-505004100640064064051202002IN250/500MALEexerciseParked CarUNKNOWNMinor DamageNoneVAArlingtonNONOHondaCivicN
3172754540373720001447060575639400-63900831164320536010720482401998IN500/1000FEMALEcampingMulti-vehicle CollisionSide CollisionTotal LossAmbulanceVANorthbendYESUNKNOWNAccuraMDXN
796421567280255001935400000047082649500-81100712392730168608430674402004IN100/300MALEreadingSingle Vehicle CollisionRear CollisionMajor DamageAmbulanceNYHillsdaleUNKNOWNUNKNOWNMercedesE400Y
4254344243585001282061612600040066880608012160486401996OH500/1000FEMALEbasketballMulti-vehicle CollisionSide CollisionMajor DamagePoliceWVNorthbrookUNKNOWNUNKNOWNChevroletSilveradoY
9912574410939210001280043398159400-32200211014698005220417602002OH100/300MALEbasketballSingle Vehicle CollisionRear CollisionTotal LossOtherWVRiverwoodNONOAccuraTLN
9241353091346420001341060170137100-46500183013267059402970237602003IN500/1000FEMALEskydivingMulti-vehicle CollisionRear CollisionMinor DamageAmbulanceWVRiverwoodNONOHondaAccordN
8281052886680550010820452216001232260500121006050423501995OH250/500FEMALEgolfMulti-vehicle CollisionRear CollisionMajor DamageFireSCRiverwoodNONOAudiA5N
36429146832746100099404527010-553008122558062062043402005OH500/1000FEMALEpoloParked CarUNKNOWNMinor DamagePoliceSCHillsdaleNOYESVolkswagenPassatY
683238434440351000152440000006074580-44800214004270042704270341601995OH250/500MALEchessMulti-vehicle CollisionRear CollisionTotal LossAmbulanceNCHillsdaleNONOSaab92xY
242190403903812000965061035436900-5370010121630063063050402001OH500/1000FEMALEcampingParked CarUNKNOWNTrivial DamageNoneSCHillsdaleUNKNOWNYESNissanUltimaN
months_as_customer age policy_number policy_deductable policy_annual_premium umbrella_limit insured_zip capital-gains capital-loss incident_hour_of_the_day number_of_vehicles_involved bodily_injuries witnesses total_claim_amount injury_claim property_claim vehicle_claim auto_year policy_state policy_csl insured_sex insured_hobbies incident_type collision_type incident_severity authorities_contacted incident_state incident_city property_damage police_report_available auto_make auto_model fraud_reported
315256428604971000128604605640-395001412175500755015100528501998IL500/1000FEMALEbungie-jumpingSingle Vehicle CollisionRear CollisionMinor DamagePoliceWVNorthbendNOUNKNOWNNissanUltimaN
472360514843211000115204346690-6240015320904801508015080603202000IL250/500MALEhikingMulti-vehicle CollisionSide CollisionTotal LossAmbulanceWVRiverwoodNOYESBMWX6N
79934244824042000155004356320-2770020101396066066026401998IN500/1000FEMALEdancingParked CarUNKNOWNTrivial DamageNoneVAHillsdaleYESUNKNOWNAudiA3N
4473525399384050017930619166002312268750125006250500002009IL250/500MALEexerciseSingle Vehicle CollisionFront CollisionMinor DamageFireNCRiverwoodYESNOChevroletMalibuN
923903152421520009510607131421000211075790137806890551202007OH250/500FEMALEhikingSingle Vehicle CollisionRear CollisionTotal LossOtherSCHillsdaleYESYESAccuraRSXN
67827645749325500948043062144500-614001130269300138606930485102010IL500/1000FEMALEreadingMulti-vehicle CollisionFront CollisionMinor DamageFireSCColumbusUNKNOWNUNKNOWNFordEscapeN
78916934725330500146904581320-5760001003870077403870270902012IN100/300FEMALEreadingSingle Vehicle CollisionRear CollisionMinor DamageFireVAArlingtonUNKNOWNYESVolkswagenPassatN
2402494354780210001518060623800161005350053505350428002015IL250/500FEMALEcross-fitSingle Vehicle CollisionFront CollisionMajor DamageFireSCRiverwoodUNKNOWNYESSaab92xN
263244402267252000130470000006054080-45000531161490559011180447202001IN500/1000MALEbase-jumpingMulti-vehicle CollisionSide CollisionMinor DamagePoliceSCHillsdaleUNKNOWNUNKNOWNDodgeRAMN
59416038497929500173304414250-438001332166780742014840445201996OH250/500MALEsleepingMulti-vehicle CollisionSide CollisionMajor DamageFireSCHillsdaleNOYESMercedesML350N